-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable PUT checksums #320
Enable PUT checksums #320
Conversation
#[tokio::test] | ||
async fn test_put_checksums() { | ||
const PART_SIZE: usize = 5 * 1024 * 1024; | ||
let (bucket, prefix) = get_test_bucket_and_prefix("test_put_object_abort"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let (bucket, prefix) = get_test_bucket_and_prefix("test_put_object_abort"); | |
let (bucket, prefix) = get_test_bucket_and_prefix("test_put_checksums"); |
impl Default for PutObjectParams { | ||
fn default() -> Self { | ||
Self { | ||
trailing_checksums: true, | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be on by default here -- as a generic S3 client it should be opt-in. But we should turn it on by default in mountpoint_s3::Uploader
.
#[non_exhaustive] | ||
pub struct PutObjectParams {} | ||
pub struct PutObjectParams { | ||
pub trailing_checksums: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a doc comment (specifically that it's crc32c).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, maybe this should be pub checksum_config: Option<ChecksumConfig>
and we re-export mountpoint_s3_crt::ChecksumConfig
from this crate, so that we don't end up duplicating stuff if we add more options here? I don't feel strongly about it either way, especially since the CRT combines PUT and GET checksum config into the same struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but it's unclear to me whether we want to expose ChecksumConfig in the long run (vs something tailored to specific requests: PutObject, MPU, GetObject).
I'd rather keep it as a simple bool for now and think about a more complete API when/if we'll introduce more options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably second having an enum/struct over boolean. Something like Option<ChecksumAlgorithm>
, even if ChecksumAlgorithm
will only be CRC32C
today?
I'm thinking the API we're putting here matches up to this one in Java SDK: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/PutObjectRequest.Builder.html#checksumAlgorithm(software.amazon.awssdk.services.s3.model.ChecksumAlgorithm)
We don't have to re-export the CRT config struct, we could just use our own in mountpoint-s3-client
if it makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dannycjones, we could start adding a ChecksumAlgorithm
enum, but what about the location (trailer vs header)? That's part of ChecksumConfig
but (I believe) only trailer is supported on multi-part uploads (which is what put_object
uses). Do we want to limit the API to what a specific request allows, or just error on misconfiguration? Related: will we have a separate put_object
for non-mpu? Or maybe that would also be configurable?
I don't think we need to answer any of these questions now, so my preference is for the simplest possible option, knowing we will replace it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably prefer to keep it simple for now, so I'm ok with using boolean. Maybe we open a new issue to review this configuration before we publish a new version of mountpoint-s3-client? I think that's the point where we say there will be no breaking change after this and you can safely use this new config.
mountpoint-s3-crt/src/s3/client.rs
Outdated
} | ||
|
||
/// Get out the inner pointer to the checksum config | ||
pub fn to_inner_ptr(&self) -> *const aws_s3_checksum_config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn to_inner_ptr(&self) -> *const aws_s3_checksum_config { | |
pub(crate) fn to_inner_ptr(&self) -> *const aws_s3_checksum_config { |
mountpoint-s3-crt/src/s3/client.rs
Outdated
location: aws_s3_checksum_location::AWS_SCL_TRAILER, | ||
checksum_algorithm: aws_s3_checksum_algorithm::AWS_SCA_CRC32C, | ||
validate_response_checksum: false, | ||
validate_checksum_algorithms: std::ptr::null_mut(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think?
location: aws_s3_checksum_location::AWS_SCL_TRAILER, | |
checksum_algorithm: aws_s3_checksum_algorithm::AWS_SCA_CRC32C, | |
validate_response_checksum: false, | |
validate_checksum_algorithms: std::ptr::null_mut(), | |
location: aws_s3_checksum_location::AWS_SCL_TRAILER, | |
checksum_algorithm: aws_s3_checksum_algorithm::AWS_SCA_CRC32C, | |
..Default::default() |
mountpoint-s3-crt/src/s3/client.rs
Outdated
} | ||
|
||
impl ChecksumConfig { | ||
/// Create a new [ChecksumConfig] with Crc32c traling checksums. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Create a new [ChecksumConfig] with Crc32c traling checksums. | |
/// Create a new [ChecksumConfig] with Crc32c trailing checksums. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, note that this is only for PUT. It's weird (but understandable) that the CRT combines checksums for PUT and GET into this one struct, but here we are.
Signed-off-by: Alessandro Passaro <[email protected]>
Description of change
Enable Crc32c checksums when uploading objects. PUT requests now have an option to enable trailing checksums.
Does this change impact existing behavior?
N/A
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).